Visual Processes for Tracking and Recognition of Hand Gestures
Author
Abstract
This paper describes experiments with techniques for tracking hands and recognizing gestures. Complementary techniques are presented for detecting and tracking hands and tools. These techniques are integrated within a system which uses multiple image processing techniques to estimate the position and orientation of a hand. Images of the tracked hand are normalized in orientation and position and then projected into a principal components space. Hand configurations are represented using a probabilistic classification. Gestures are recognized in this space as sequences of hand configurations using finite state machines.

1 Direct Manipulation of Objects as an Interaction Modality

Human gesture serves three functional roles [4]: semiotic, ergotic, and epistemic. The semiotic function of gesture is to communicate meaningful information. The structure of a semiotic gesture is conventional and commonly results from shared cultural experience. The good-bye gesture, American Sign Language, the operational gestures used to guide airplanes on the ground, and even the vulgar "finger" all illustrate the semiotic function of gesture. The ergotic function of gesture is associated with the notion of work. It corresponds to the capacity of humans to manipulate the real world, to create artifacts, or to change the state of the environment by "direct manipulation". Shaping pottery from clay or wiping away dust results from ergotic gestures. The epistemic function of gesture allows humans to learn from the environment through tactile experience. By moving your hand over an object, you appreciate its structure and may discover the material it is made of, as well as other properties. All three functions may be augmented using an instrument or tool.
Examples include a handkerchief for the semiotic good-bye gesture, a turn-table for the ergotic gesture of shaping pottery, or a dedicated artifact to explore the world (for example, a retro-active system such as the pantograph [13] to sense the invisible).

In Human-Computer Interaction, gesture has been exploited primarily for its ergotic function: typing on a keyboard, moving a mouse and clicking buttons. The epistemic role of gesture has emerged effectively from pen computing and virtual reality: ergotic gestures applied to an electronic pen, a data-glove or a body-suit are transformed into meaningful expressions for the computer system. Special-purpose interaction languages have been defined, typically 2-D pen gestures as in the Apple Newton, or 3-D hand gestures to navigate virtual spaces or to control objects remotely [1]. With the exception of the electronic pen and the keyboard, which both have non-computerized counterparts, mice, data-gloves, and body-suits are "artificial add-ons" that tether the user to the computer. They are not real end-user instruments (as a hammer would be), but convenient tricks that let computer scientists sense human gesture. Computer vision can transform ordinary artefacts and even body parts into effective input devices. Krueger's work on VIDEOPLACE [11], followed recently by Wellner and Mackay's concept of the "digital desk" [18], shows that the camera can be used as a non-intrusive sensor for human gesture. However, to be effective, the processing behind the camera must be fast and robust. The techniques used by Krueger and Wellner are simple concept demonstrations: they are fast but fragile, and work only within highly constrained environments. We are exploring the integration of appearance-based computer vision techniques to observe human gesture non-intrusively, in a fast and robust manner. The techniques described in this paper were developed within the context of a digital desk.
In this system, a computer screen is projected onto a physical desk using a liquid-crystal "datashow" panel with a standard overhead projector. A video camera is set up to watch the workspace such that the surface of the projected image and the surface of the imaged area coincide. The workspace contains a number of physical and virtual objects which can be manipulated by a human hand. Both physical and virtual devices act as tools whose manipulation is a communication channel between the user and the computer. The identity of the object being manipulated carries a strong semiotic message. The manner in which the object is manipulated provides both semiotic and ergotic information. Tools may be virtual objects, or any convenient physical object which has previously been presented to the system. Virtual objects are generated internally by the system and projected
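The pipeline described in the abstract (normalized hand images projected into a principal components space, a probabilistic classification into hand configurations, and a finite state machine over the configuration sequence) can be sketched as follows. Everything below is assumed for illustration only: the image size, the Gaussian class-conditional model, the prototype-based classifier, and the example gesture are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Offline: build a PCA space from (stand-in) training images.
# Real inputs would be hand images normalized in orientation and position,
# then flattened; here random vectors stand in for 64-pixel images.
train = rng.normal(size=(50, 64))
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
basis = vt[:5]                       # keep the first 5 principal components

def project(image):
    """Project a normalized hand image into the principal components space."""
    return basis @ (image - mean)

# --- Probabilistic classification of hand configurations: one prototype
# per configuration, with a Gaussian density over PCA coefficients.
prototypes = {"open": project(train[0]), "point": project(train[1])}

def classify(image, sigma=1.0):
    coeffs = project(image)
    scores = {c: np.exp(-np.sum((coeffs - p) ** 2) / (2 * sigma**2))
              for c, p in prototypes.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}   # posterior over configs

# --- Gesture recognition: a finite state machine over the most likely
# configuration in each frame. Example gesture = open -> point -> open.
def recognize(frames):
    transitions = {(0, "open"): 1, (1, "point"): 2, (2, "open"): 3}
    state = 0
    for f in frames:
        post = classify(f)
        label = max(post, key=post.get)
        state = transitions.get((state, label), state)
    return state == 3                # accepting state reached
```

In this sketch an unmatched (state, label) pair leaves the machine in place; a real system would also need time-outs and a rejection class so that spurious configurations do not stall the recognizer.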
Similar Resources
Neural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features
This paper presents a comparative study of multilayer perceptron (MLP) and radial basis function (RBF) neural networks, trained with supervised learning and the backpropagation algorithm, for tracking hand gestures. Both networks have two output classes: hand and face. Skin is detected by a region-based algorithm in the image, and the networks are then applied to video sequences frame by frame in...
Human Computer Interaction Using Vision-Based Hand Gesture Recognition
With the rapid emergence of 3D applications and virtual environments in computer systems, the need for a new type of interaction device arises. This is because traditional devices such as the mouse, keyboard, and joystick become inefficient and cumbersome within these virtual environments. In other words, the evolution of user interfaces shapes the change in Human-Computer Interaction (HCI). In...
3D Hand Tracking Using Kalman Filter in Depth Space
Hand gestures are an important type of natural language used in many research areas such as human-computer interaction and computer vision. Hand gesture recognition requires the prior determination of the hand position through detection and tracking. One of the most efficient strategies for hand tracking is to use 2D visual information such as color and shape. However, visual-sensor-based hand...
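As a hedged illustration of the Kalman-filter tracking idea mentioned above, a constant-velocity filter over 2D hand positions might look like the sketch below. The state model, noise covariances, and measurement setup are assumptions for the sketch, not details from the cited paper.

```python
import numpy as np

# State: [x, y, vx, vy]; measurement: observed hand position [x, y].
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)   # constant-velocity state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)    # we observe position only
Q = 0.01 * np.eye(4)                   # process noise (assumed)
R = 0.5 * np.eye(2)                    # measurement noise (assumed)

def kalman_track(measurements):
    """Filter a sequence of 2D position measurements; return position estimates."""
    x = np.zeros(4)                    # initial state
    P = np.eye(4)                      # initial covariance
    track = []
    for z in measurements:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the observed hand position.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.asarray(z, float) - H @ x)
        P = (np.eye(4) - K @ H) @ P
        track.append(x[:2].copy())
    return track
```

The velocity components of the state let the filter predict through short detection failures, which is one reason such filters are popular for hand tracking.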
Virtual Document Projector Camera
This paper describes techniques for the design of a system able to interact with the user by visual recognition of hand gestures. The system is composed of three modules: tracking, posture classification and gesture recognition. A description of each module is given. In order to increase the robustness and the precision of the tracking, several complementary tracking processes are coup...
Recognition of Alphabetical Hand Gestures Using Hidden Markov Model
The use of hand gestures provides an attractive alternative to cumbersome interface devices for human-computer interaction (HCI). In particular, visual interpretation of hand gestures can help achieve easy and natural comprehension for HCI. Many methods for hand gesture recognition using visual analysis have been proposed, such as syntactical analysis, neural networks (NN), and hidden Markov models...
Publication date: 1997